{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Animation\n",
    "\n",
    "Animations can be an effective way to show an extra dimension of your data, especially changes over time. Clearly, they're not appropriate everywhere, e.g. in static media! For various reasons, they should be used sparingly but, used rarely and with care, they can convey a message about data more effectively than a static chart ever could. Perhaps the most famous example of an animated chart is the legendary Hans Rosling's [gapminder animation](https://youtu.be/jbkSRLYSojo?t=29) covering 200 countries and 200 years of progress in income and health.\n",
    "\n",
    "The most common scenario for wanting to create an animation is sharing on social media (gifs are ubiquitous on the internet) or displaying an animation of some of your results during a talk. In both these cases, creating a gif is probably the best and safest option. 'The best' because although video files tend to be very large (and thus difficult to share), gifs can be turned into small files with a bit of tweaking. 'The safest' because even though presentation software like Microsoft's PowerPoint can crumble when trying to play an embedded video file, gifs usually work just fine. And while video formats differ across operating systems, all of them will play gifs (all you need to play a gif is a browser such as Safari or Chrome). However, you can also create animations in video formats such as .mp4. The main downside of gifs is that they can produce quite jerky videos.\n",
    "\n",
    "There are several ways to produce animations in Python, which we'll come to in a moment. First though, let's import some of the general packages we'll need: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "from pathlib import Path\n",
    "\n",
    "# Settings\n",
    "# Set max rows displayed for readability\n",
    "pd.set_option(\"display.max_rows\", 6)\n",
    "# Plot settings\n",
    "plt.style.use(\n",
    "    \"https://github.com/aeturrell/coding-for-economists/raw/main/plot_style.txt\"\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Plot To Animate\n",
    "\n",
    "First, we need a plot to animate. And what better dataset than the gapminder data? \"There is none!\" I hear you shout at your computer. But hang on, perhaps we'd better see a simple demonstration of some of the principles we'll be using first.\n",
    "\n",
    "A video is a sequence of static images stitched together and replayed quickly, like a flickbook cartoon. We will use the same trick, a series of static images played after one another, to create animated charts. But this means we need to create the series of static images.\n",
    "\n",
    "Let's create individual images and index them with $i$. We want to move on to a new scene every time we increment $i$. We do this by creating a *function* that accepts $i$ as a parameter and that modifies what is shown on the image. The example below shows how this works: although the $x$ and $y$ arrays are always defined, and an $x$-$y$ line plot is always shown, the `ax.scatter` function is passed only the $i$th element of the $x$ and $y$ arrays. When we call `plot_scene` with different values of $i$, the scene updates to the next moment in time. In the example below, going from $i=0$ to $i=1$ moves the bead one step along its journey between 0 and $2\\pi$; with 20 points there are 19 such steps, so each covers 1/19th of the journey.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def plot_scene(i):\n",
    "    fig, ax = plt.subplots(figsize=(3, 3), dpi=100)\n",
    "    # Create an array\n",
    "    x = np.linspace(0, 2 * np.pi, 20)\n",
    "    # Create the y function\n",
    "    y = np.sin(x)\n",
    "    ax.plot(x, y)\n",
    "    ax.scatter([x[i]], [y[i]])\n",
    "    ax.set_title(f\"Scene {i}\")\n",
    "\n",
    "\n",
    "plot_scene(0)\n",
    "plot_scene(1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Okay, so that's the basic idea: a function of the scene that we can move forward through time. Later, we'll stitch together the independent images. For now, let's turn back to the gapminder data.\n",
    "\n",
    "The gapminder data are from the excellent [Our World in Data](https://ourworldindata.org/) website, but they've been filtered a bit for this book. Let's load them up:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = pd.read_csv(\n",
    "    \"https://github.com/aeturrell/coding-for-economists/raw/main/data/owid_gapminder.csv\"\n",
    ")\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Brilliant, that's our data. But now we need a plot, with a few dimensions of data, and a way to progress that plot through time.\n",
    "\n",
    "Let's first set up the final state of our plot, which is where the animation will eventually end up. This will be based on the final year in the data."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "There are five dimensions displayed in the plot we're recreating:\n",
    "\n",
    "- continent, via colour\n",
    "- income (GDP per capita), on the x-axis\n",
    "- life expectancy, on the y-axis\n",
    "- population, as bubble size\n",
    "- year, as the time dimension\n",
    "\n",
    "We have almost all of those columns already, but we do need to convert the list of *continents* to a list of *colours*. **matplotlib** has a built-in colour cycler that we can use for this purpose. It is accessed via `plt.rcParams['axes.prop_cycle'].by_key()['color']`. To map the continents into colours, we'll use a dictionary combined with the `zip` function, which 'zips' two lists together. If the lists have unequal lengths, `zip` stops at the end of the shorter one; as there are only five continents in our data but ten colours, that is what happens here."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(plt.rcParams[\"axes.prop_cycle\"].by_key()[\"color\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "colour_mapping = dict(\n",
    "    zip(df[\"Continent\"].unique(), plt.rcParams[\"axes.prop_cycle\"].by_key()[\"color\"])\n",
    ")\n",
    "df[\"colour\"] = df[\"Continent\"].map(colour_mapping)\n",
    "df.head()"
   ]
  },
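  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see the truncating behaviour of `zip` on its own, here is a minimal sketch with made-up lists. The names `continents`, `colours`, and `toy_mapping` are purely illustrative assumptions, not the gapminder columns or the real colour cycle:\n",
    "\n",
    "```python\n",
    "# Illustrative inputs only: three categories but five colours\n",
    "continents = ['Asia', 'Europe', 'Africa']\n",
    "colours = ['red', 'green', 'blue', 'orange', 'purple']\n",
    "\n",
    "# zip pairs items up by position and stops when the shorter list runs out\n",
    "toy_mapping = dict(zip(continents, colours))\n",
    "print(toy_mapping)\n",
    "```\n",
    "\n",
    "The two leftover colours are silently dropped. The reverse case is worth watching for: with more categories than colours, the extra categories would be absent from the dictionary and would map to missing values."
   ]
  },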
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Okay, so we have all we need for our plot! Now we need to make a function that plots this when passed an index that gets mapped into a year. Note that while we have every year of data here, that may not be true in general so it's always safer to map an index to the years that *are* available using a dictionary like so:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "step_to_yr_map = dict(zip(range(len(df[\"Year\"].unique())), df[\"Year\"].unique()))"
   ]
  },
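  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a hedged aside, **pandas**' `unique` returns values in order of first appearance rather than sorted, so if a dataset were not already ordered by year it might be safer to sort before numbering the steps. Below is a minimal sketch with made-up years; the names `years_available` and `toy_step_to_yr` are illustrative assumptions and not part of the gapminder data:\n",
    "\n",
    "```python\n",
    "# Hypothetical years with gaps, listed out of order\n",
    "years_available = [1960, 1950, 1975, 1955]\n",
    "\n",
    "# Sort first so that step 0 is the earliest year, then number the steps\n",
    "toy_step_to_yr = dict(enumerate(sorted(years_available)))\n",
    "print(toy_step_to_yr)\n",
    "```\n",
    "\n",
    "Here `enumerate` does the same job as zipping with `range(len(...))`, and the gaps between years do not matter because the animation only ever steps through the keys 0, 1, 2, and so on."
   ]
  },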
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def gapminder_at_year(i, gap_df):\n",
    "    \"\"\"Plots the gapminder data for a given step\"\"\"\n",
    "    # Raise (rather than return) errors so that bad input stops execution\n",
    "    if not isinstance(i, int):\n",
    "        raise TypeError(\"i must be an integer.\")\n",
    "    # Map steps into years\n",
    "    step_to_yr_map = dict(\n",
    "        zip(range(len(gap_df[\"Year\"].unique())), gap_df[\"Year\"].unique())\n",
    "    )\n",
    "    if i not in step_to_yr_map:\n",
    "        raise ValueError(\"i is beyond range of number of years.\")\n",
    "    # Filtered version of data with only the relevant year\n",
    "    gdf_yr = gap_df[gap_df[\"Year\"] == step_to_yr_map[i]]\n",
    "    plt.close(\"all\")\n",
    "    fig, ax = plt.subplots(figsize=(4, 3), dpi=150)\n",
    "    ax.scatter(\n",
    "        x=gdf_yr[\"GDP per capita\"] / 1e3,\n",
    "        y=gdf_yr[\"Life expectancy\"],\n",
    "        c=gdf_yr[\"colour\"],\n",
    "        s=gdf_yr[\"Population\"] / 1e6,\n",
    "        alpha=0.8,\n",
    "        edgecolor=\"k\",\n",
    "    )\n",
    "    ax.set_xlim(0, 100)\n",
    "    ax.set_ylim(20, gap_df[\"Life expectancy\"].max() * 1.2)\n",
    "    ax.set_xlabel(\"GDP per capita (1000s 2011 USD)\")\n",
    "    ax.set_ylabel(\"Life expectancy (years)\")\n",
    "    ax.set_title(f\"Gapminder: {step_to_yr_map[i]}\")\n",
    "    plt.show()\n",
    "\n",
    "\n",
    "# Test this out at the first step\n",
    "gapminder_at_year(0, df)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We now have everything we need to create the individual scenes in our animation."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Creating Animations Using the **Gif** Package\n",
    "\n",
    "The [**gif**](https://github.com/maxhumber/gif) package is here to make it as easy as possible to create gifs from a series of plot images created using **matplotlib**, **altair**, or **plotly**. (In principle, this should also work for **plotnine**, as it's built on top of **matplotlib**.) To install it, use `pip install gif` on the command line.\n",
    "\n",
    "**gif** works by using function decorators (you can learn more about these in the advanced coding chapter). These 'decorate' existing functions using an `@` symbol and change their behaviour. In this case, the decorator is `@gif.frame` and it tells **gif** to treat each call of this function as one frame of a gif.\n",
    "\n",
    "Note that in the example below, we'll use a lower resolution and fewer frames than would be ideal, just to save on some file space.\n",
    "\n",
    "A few tips:\n",
    "\n",
    "- use a high resolution when creating the gif, as high as 900 dots per inch. In the example below we use a more modest 400 dots per inch to save a bit of space, but this still gives a hefty file size. However...\n",
    "- ...you can use an optimizer to then shrink that finished high-resolution gif down to a more respectable size. In the example below, [**pygifsicle**](https://github.com/LucaCappelletti94/pygifsicle) is used to do this; it is a package that you will need to install separately. In our example, this takes the gif down to under 1MB.\n",
    "- repeat the last frame multiple times; this lets the viewer see where your animation ended up and gives a pause before it loops around again.\n",
    "\n"
   ]
  },
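  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If decorators are new to you, the general mechanism mentioned above can be sketched in a few lines. This is only an illustration of how a decorator wraps a function; it is not how **gif** implements `@gif.frame`, and the names `announce` and `double` are hypothetical:\n",
    "\n",
    "```python\n",
    "import functools\n",
    "\n",
    "\n",
    "def announce(func):\n",
    "    '''Hypothetical decorator: reports each call, then runs func as normal.'''\n",
    "\n",
    "    @functools.wraps(func)  # keep the wrapped function's name and docstring\n",
    "    def wrapper(*args, **kwargs):\n",
    "        print(f'Calling {func.__name__}')\n",
    "        return func(*args, **kwargs)\n",
    "\n",
    "    return wrapper\n",
    "\n",
    "\n",
    "@announce  # equivalent to writing: double = announce(double)\n",
    "def double(x):\n",
    "    return 2 * x\n",
    "\n",
    "\n",
    "result = double(4)\n",
    "print(result)\n",
    "```\n",
    "\n",
    "The decorated function is called exactly as before; the decorator decides what extra work happens around the call."
   ]
  },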
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```python\n",
    "import gif\n",
    "\n",
    "@gif.frame\n",
    "def gapminder_at_year(i, gap_df):\n",
    "    \"\"\" Plots the gapminder data for a given step\n",
    "    \"\"\"\n",
    "    if not isinstance(i, int):\n",
    "        raise TypeError(\"i must be an integer.\")\n",
    "    # Map steps into years\n",
    "    step_to_yr_map = dict(zip(range(len(gap_df[\"Year\"].unique())),\n",
    "                              gap_df[\"Year\"].unique()))\n",
    "    if i not in step_to_yr_map:\n",
    "        raise ValueError(\"i is beyond range of number of years.\")\n",
    "    gdf_yr = gap_df[gap_df[\"Year\"] == step_to_yr_map[i]]\n",
    "    plt.close('all')\n",
    "    fig, ax = plt.subplots(figsize=(4, 3), dpi=400)\n",
    "    ax.scatter(x=gdf_yr[\"GDP per capita\"]/1e3, y=gdf_yr[\"Life expectancy\"],\n",
    "               c=gdf_yr[\"colour\"], s=gdf_yr[\"Population\"]/1e6,\n",
    "               alpha=0.8, edgecolor='k')\n",
    "    ax.set_xlim(0, 100)\n",
    "    ax.set_ylim(20, gap_df[\"Life expectancy\"].max()*1.2)\n",
    "    ax.set_xlabel(\"GDP per capita (1000s 2011 USD)\")\n",
    "    ax.set_ylabel(\"Life expectancy (years)\")\n",
    "    ax.set_title(f\"Gapminder: {step_to_yr_map[i]}\")\n",
    "    plt.tight_layout()\n",
    "\n",
    "# How many iterations do we want?\n",
    "total_num_years = len(df[\"Year\"].unique())\n",
    "# Produce the frames. Here, I'll only do the last 20 years to make the gif small\n",
    "frames = [gapminder_at_year(i, df) for i in range(total_num_years-20, total_num_years)]\n",
    "# A trick: repeat the last frame a few more times to let people see where the animation ends\n",
    "frames = frames + [gapminder_at_year(total_num_years-1, df) for i in range(10)]\n",
    "# Save the animation\n",
    "gif.save(frames, Path(\"img/gapminder.gif\"), duration=200, unit=\"ms\", between=\"frames\")\n",
    "# Optimize the animation to make it smaller (and therefore easier to share)\n",
    "from pygifsicle import optimize\n",
    "optimize(Path(\"img/gapminder.gif\"))\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![Gif of shortened gapminder sequence.](https://github.com/aeturrell/coding-for-economists/raw/main/img/gapminder.gif)\n",
    "\n",
    "And there we have a (shortened and lower-resolution-than-ideal) gapminder sequence! Can you spot the financial crisis?\n",
    "\n",
    "A higher resolution run would smooth out that slightly roughened text. Now you've seen how it works, you can really run quite free with animations because, if you can draw it as a series of frames with matplotlib (and related libraries), plotly, or altair, you can animate it.\n",
    "\n",
    "\n",
    "```{admonition} Exercises\n",
    "1. Animate the bead on a sine curve we saw earlier using the **gif** package.\n",
    "2. Animate the complete gapminder sequence, from first year of data to 2018.\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Making Animations Using the **imageio** Package\n",
    "\n",
    "[**imageio**](https://imageio.readthedocs.io/en/stable/) is an alternative package for making animations. It works in a slightly different way, with an intermediate step between creating the static images and weaving them into a video: the frames are first saved to disk as image files. In return, it allows you to stitch together jpg files as well as png files.\n",
    "\n",
    "Let's imagine that you have a folder for temporary files called 'scratch' and a directory for finished gifs called 'img'. The code below gives an example of how you would use **imageio** to create 20 frames of a video and then weave them together into a .gif:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```python\n",
    "import imageio\n",
    "\n",
    "\n",
    "def frame(i):\n",
    "    plt.close('all')\n",
    "    x = np.linspace(0, 2*np.pi, 20)\n",
    "    y = np.sin(x)\n",
    "    fig, ax = plt.subplots(figsize=(2, 2), dpi=900)\n",
    "    ax.plot(x, y, zorder=1, lw=0.5)\n",
    "    ax.scatter([x[i]], [y[i]], zorder=2)\n",
    "    ax.set_title(f\"Scene {i}\")\n",
    "    # Outputs files of name \"frame000.png\" etc., but you could use .jpg\n",
    "    plt.savefig(Path(f'scratch/frame{i:0>3d}.png'))\n",
    "\n",
    "# Create the frames; this creates files within the 'scratch' directory\n",
    "for i in range(20):\n",
    "    frame(i)\n",
    "\n",
    "# Sort the file names so that the frames are stitched together in order\n",
    "file_names = sorted(Path(\"scratch\").glob(\"*.png\"))\n",
    "\n",
    "# Now create the gif in the img directory. The with block closes the\n",
    "# writer at the end, which makes sure the file is fully written to disk.\n",
    "with imageio.get_writer(Path(\"img/sin_bead.gif\"), mode=\"I\", fps=4) as writer:\n",
    "    for img_file in file_names:\n",
    "        image = imageio.imread(img_file)\n",
    "        writer.append_data(image)\n",
    "\n",
    "from pygifsicle import optimize\n",
    "optimize(Path(\"img/sin_bead.gif\"))\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![Gif of point on sine curve made with imageio](https://github.com/aeturrell/coding-for-economists/raw/main/img/sin_bead.gif)"
   ]
  },
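  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One detail in the code above is easy to miss: the frame numbers in the file names are zero-padded (`frame000.png`, `frame001.png`, and so on). This matters because the file names are sorted as text before being stitched together. The sketch below, which uses made-up file names and the illustrative variables `unpadded` and `padded`, shows what can go wrong without the padding:\n",
    "\n",
    "```python\n",
    "# Hypothetical frame numbers, chosen to straddle one and two digits\n",
    "unpadded = [f'frame{i}.png' for i in (1, 2, 10)]\n",
    "padded = [f'frame{i:0>3d}.png' for i in (1, 2, 10)]\n",
    "\n",
    "# Text sorting compares character by character, so '10' sorts before '2'\n",
    "print(sorted(unpadded))\n",
    "# With zero padding, text order and numerical order agree\n",
    "print(sorted(padded))\n",
    "```\n",
    "\n",
    "With unpadded names, frame 10 would be played before frame 2, which would scramble the animation."
   ]
  },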
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**imageio** can also write to .mp4 files if needed, and has a host of other features not covered here.\n",
    "\n",
    "## See Also\n",
    "\n",
    "There are some excellent other tools out there. The command line tool [**ffmpeg**](https://www.ffmpeg.org/) is worth a look if you need fancier video options than are provided by the two tools we've met.\n",
    "\n",
    "[**Matplotlib**](https://matplotlib.org/stable/api/animation_api.html) has a built-in animation feature too, though historically it has been a bit more tricky to use than tools like **imageio** or **gif**.\n",
    "\n",
    "[**ahlive**](https://ahlive.readthedocs.io/en/latest) is a really promising package that is still in early development. It integrates directly with **pandas** and has some easy-to-use features for making sophisticated looking animations in just a few lines of code.\n",
    "\n",
    "Finally, the master of animation in Python (albeit with a *very* steep learning curve) is [**manim**](https://3b1b.github.io/manim/index.html), which makes extremely professional-looking mathematical animations. As an example, take a look at [this video on Fourier Series](https://www.youtube.com/watch?v=r6sGWTCMz2k&ab_channel=3Blue1Brown) that was created with **manim**."
   ]
  }
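  ,
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To give a flavour of the built-in **matplotlib** animation feature mentioned above, here is a minimal sketch that animates the bead on a sine curve using `FuncAnimation` and saves it as a gif with `PillowWriter`. The output file name, the temporary directory, and the choice of the `Agg` backend are assumptions made so that the sketch runs on its own as a script; in a notebook you would probably drop the `matplotlib.use` line and save into your own directory:\n",
    "\n",
    "```python\n",
    "import tempfile\n",
    "from pathlib import Path\n",
    "\n",
    "import matplotlib\n",
    "\n",
    "matplotlib.use('Agg')  # assumption: no display available; omit in a notebook\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "from matplotlib.animation import FuncAnimation, PillowWriter\n",
    "\n",
    "x = np.linspace(0, 2 * np.pi, 20)\n",
    "y = np.sin(x)\n",
    "\n",
    "fig, ax = plt.subplots(figsize=(3, 3), dpi=100)\n",
    "ax.plot(x, y)\n",
    "bead = ax.scatter([x[0]], [y[0]], zorder=2)\n",
    "\n",
    "\n",
    "def update(i):\n",
    "    # Move the existing bead rather than redrawing the whole figure\n",
    "    bead.set_offsets([[x[i], y[i]]])\n",
    "    ax.set_title(f'Scene {i}')\n",
    "    return (bead,)\n",
    "\n",
    "\n",
    "# One frame per point on the curve; interval is in milliseconds\n",
    "anim = FuncAnimation(fig, update, frames=len(x), interval=250)\n",
    "\n",
    "# Hypothetical output location: a file inside a fresh temporary directory\n",
    "out_path = Path(tempfile.mkdtemp()) / 'sin_bead_funcanimation.gif'\n",
    "anim.save(str(out_path), writer=PillowWriter(fps=4))\n",
    "plt.close(fig)\n",
    "```\n",
    "\n",
    "Unlike the approaches earlier in the chapter, the figure is created once and only the bead and title are updated for each frame."
   ]
  }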
 ],
 "metadata": {
  "interpreter": {
   "hash": "a723953a921d3d66a26057d9d3b423b845ee567ba5efd2daaf047bd46dbfed26"
  },
  "kernelspec": {
   "display_name": "Python 3.8.6 64-bit ('codeforecon': conda)",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
